Berry–Esseen theorem

The central limit theorem in probability theory and statistics states that, under certain circumstances, the distribution of the suitably standardized sample mean approaches a normal distribution as the sample size increases. The Berry–Esseen theorem, also known as the Berry–Esseen inequality, quantifies the rate at which this convergence to normality takes place.

Statement of the theorem

Statements of the theorem vary, as it was independently discovered by two mathematicians, Andrew C. Berry (in 1941) and Carl-Gustav Esseen (1942), who then, along with other authors, refined it repeatedly over subsequent decades.

Identically distributed summands

One version, sacrificing generality somewhat for the sake of clarity, is the following:

Let X_1, X_2, ... be i.i.d. random variables with E(X_1) = 0, E(X_1^2) = σ^2 > 0, and E(|X_1|^3) = ρ < ∞. Also, let
Y_n = {X_1 + X_2 + \cdots + X_n \over n}
be the sample mean, with F_n the cdf of
{Y_n \sqrt{n} \over \sigma},
and Φ the cdf of the standard normal distribution. Then there exists a positive constant C such that for all x and n,
\left|F_n(x) - \Phi(x)\right| \le {C \rho \over \sigma^3\,\sqrt{n}}.\ \ \ \ (1)

That is: given a sequence of independent and identically distributed random variables, each having mean zero and positive variance, if additionally the third absolute moment is finite, then the cumulative distribution functions of the standardized sample mean and the standard normal distribution differ (vertically, on a graph) by no more than the specified amount. Note that the rate of convergence is of order n^{-1/2}.
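As an illustration of inequality (1), the following minimal Python sketch computes the exact value of sup_x |F_n(x) − Φ(x)| for sums of fair ±1 (Rademacher) variables, for which σ = 1 and ρ = 1, and compares it with the right-hand side of (1). The choice of distribution, the function names, and the default C = 0.4784 (the upper estimate quoted in this article) are assumptions made for the example only.

```python
import math


def phi(x):
    """Standard normal cdf, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kolmogorov_distance_rademacher(n):
    """Exact sup_x |F_n(x) - Phi(x)| for the standardized sum of n fair +/-1 variables.

    F_n is a step function whose jumps sit at x_k = (2k - n) / sqrt(n),
    so the supremum is attained just before or at one of the jump points.
    Intended for moderate n (roughly n <= 1000) to stay within float range.
    """
    cdf_before = 0.0
    worst = 0.0
    for k in range(n + 1):
        x = (2 * k - n) / math.sqrt(n)
        pmf = math.comb(n, k) / 2.0 ** n  # P(exactly k of the n variables equal +1)
        cdf_after = cdf_before + pmf
        worst = max(worst, abs(cdf_before - phi(x)), abs(cdf_after - phi(x)))
        cdf_before = cdf_after
    return worst


def berry_esseen_bound(n, sigma=1.0, rho=1.0, c=0.4784):
    """Right-hand side of inequality (1); c is an assumed value of the constant C."""
    return c * rho / (sigma ** 3 * math.sqrt(n))


if __name__ == "__main__":
    for n in (1, 4, 16, 64, 256):
        print(n, kolmogorov_distance_rademacher(n), berry_esseen_bound(n))
```

In this example both columns shrink at the rate n^{-1/2}, so the bound cannot be improved in order, only in the value of the constant.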

Calculated values of the constant C have decreased markedly over the years, from the original value of 7.59 by Esseen (1942), to 0.7882 by van Beek (1972), then 0.7655 by Shiganov (1986), then 0.7056 by Shevtsova (2007), then 0.7005 by Shevtsova (2008), then 0.5894 by Tyurin (2009), then 0.5129 by Korolev & Shevtsova (2009), then 0.4785 by Tyurin (2010). A detailed review can be found in the papers Korolev & Shevtsova (2009) and Korolev & Shevtsova (2010). The best estimate as of 2011, C < 0.4784, follows from the inequality

\sup_x\left|F_n(x) - \Phi(x)\right| \le {0.33477 (\rho + 0.429\sigma^3)\over \sigma^3\,\sqrt{n}},

due to Korolev & Shevtsova (2010), since σ^3 ≤ ρ and 0.33477 · 1.429 < 0.4784.
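The step from this inequality to C < 0.4784 can be written out as follows. The first relation is Lyapunov's inequality for moments; the rest is arithmetic with the constants quoted above (the product evaluates to about 0.478386).

```latex
\sigma^3 = \big(E X_1^2\big)^{3/2} \le E|X_1|^3 = \rho
\quad\Longrightarrow\quad
\frac{0.33477\,(\rho + 0.429\,\sigma^3)}{\sigma^3\sqrt{n}}
\le \frac{0.33477 \cdot 1.429\,\rho}{\sigma^3\sqrt{n}}
< \frac{0.4784\,\rho}{\sigma^3\sqrt{n}}.
```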

Esseen (1956) proved that the constant must satisfy


    C\geq\frac{\sqrt{10}+3}{6\sqrt{2\pi}} \approx 0.40973 \approx \frac{1}{\sqrt{2\pi}} + 0.01079 .
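The numerical values in this lower bound can be checked with a few lines of Python. This is only an arithmetic sketch; the variable names are chosen for the example, and 0.4784 is the upper estimate quoted earlier in this section.

```python
import math

# Esseen's lower bound (sqrt(10) + 3) / (6 * sqrt(2 * pi)) for the constant C.
esseen_lower = (math.sqrt(10) + 3) / (6 * math.sqrt(2 * math.pi))

# Excess of the lower bound over 1 / sqrt(2 * pi).
excess = esseen_lower - 1 / math.sqrt(2 * math.pi)

# Width of the interval left between the lower bound and the upper estimate 0.4784.
remaining_gap = 0.4784 - esseen_lower

if __name__ == "__main__":
    print(esseen_lower, excess, remaining_gap)
```

The remaining gap between the lower bound and the quoted upper estimate is below 0.07.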

Non-identically distributed summands

Let X_1, X_2, ... be independent random variables with E(X_i) = 0, E(X_i^2) = σ_i^2 > 0, and E(|X_i|^3) = ρ_i < ∞. Also, let
S_n = {X_1 + X_2 + \cdots + X_n \over \sqrt{\sigma_1^2+\sigma_2^2+\cdots+\sigma_n^2} }
be the normalized n-th partial sum. Denote by F_n the cdf of S_n, and by Φ the cdf of the standard normal distribution. For the sake of convenience denote
\vec{\sigma}=(\sigma_1,...,\sigma_n),\ \vec{\rho}=(\rho_1,...,\rho_n).
In 1941, Andrew C. Berry proved that there exists an absolute constant C_1 such that for all n
\sup_x\left|F_n(x) - \Phi(x)\right| \le C_1\cdot\psi_1,\ \ \ \ (2)
where
\psi_1=\psi_1\big(\vec{\sigma},\vec{\rho}\big)=\Big({\textstyle\sum\limits_{i=1}^n\sigma_i^2}\Big)^{-1/2}\cdot\max_{1\le i\le n}\frac{\rho_i}{\sigma_i^2}.
Independently, in 1942, Carl-Gustav Esseen proved that there exists an absolute constant C_0 such that for all n
\sup_x\left|F_n(x) - \Phi(x)\right| \le C_0\cdot\psi_0, \ \ \ \ (3)
where
\psi_0=\psi_0\big(\vec{\sigma},\vec{\rho}\big)=\Big({\textstyle\sum\limits_{i=1}^n\sigma_i^2}\Big)^{-3/2}\cdot\sum\limits_{i=1}^n\rho_i.

It is easy to check that ψ_0 ≤ ψ_1: each ρ_i is at most σ_i^2 times the maximum of the ratios ρ_j/σ_j^2, so the sum of the ρ_i is at most that maximum times the sum of the σ_i^2. For this reason inequality (3) is conventionally called the Berry–Esseen inequality, and the quantity ψ_0 is called the Lyapunov fraction of the third order. Moreover, in the case where the summands X_1, ..., X_n have identical distributions

\psi_0=\psi_1=\frac{\rho_1}{\sigma_1^3\sqrt{n}},

and thus the bounds stated by inequalities (1), (2) and (3) coincide.
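The relation between the two quantities can be illustrated numerically. The sketch below, with hypothetical helper names `psi0` and `psi1` and made-up moment values, evaluates both expressions for a non-identically distributed example and for an i.i.d. example in which they coincide with ρ_1/(σ_1^3 √n).

```python
import math


def psi0(sigmas, rhos):
    """Lyapunov fraction of the third order: sum(rho_i) / (sum(sigma_i^2))^(3/2)."""
    total_var = sum(s * s for s in sigmas)
    return sum(rhos) / total_var ** 1.5


def psi1(sigmas, rhos):
    """Berry's quantity: max_i(rho_i / sigma_i^2) / sqrt(sum(sigma_i^2))."""
    total_var = sum(s * s for s in sigmas)
    return max(r / (s * s) for s, r in zip(sigmas, rhos)) / math.sqrt(total_var)


# Illustrative (made-up) moments; each rho_i is at least sigma_i^3, as it must be.
mixed_sigmas = [1.0, 2.0, 0.5]
mixed_rhos = [1.5, 10.0, 0.2]

# Identically distributed case: n = 9, sigma = 2, rho = 10.
iid_sigmas = [2.0] * 9
iid_rhos = [10.0] * 9

if __name__ == "__main__":
    print(psi0(mixed_sigmas, mixed_rhos), psi1(mixed_sigmas, mixed_rhos))
    print(psi0(iid_sigmas, iid_rhos), psi1(iid_sigmas, iid_rhos))
```

In the mixed example ψ_0 is strictly smaller than ψ_1, which is why a bound in terms of ψ_0 is the more informative of the two for a comparable constant.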

Regarding C_0, the lower bound established by Esseen (1956) remains valid, because the identically distributed case is a special case of (3):


    C_0\geq\frac{\sqrt{10}+3}{6\sqrt{2\pi}} = 0.4097\ldots.

The upper bounds for C_0 were subsequently lowered from the original estimate 7.59 due to Esseen (1942) to (we mention the recent results only) 0.9051 due to Zolotarev (1967), 0.7975 due to van Beek (1972), 0.7915 due to Shiganov (1986), 0.6379 and 0.5606 due to Tyurin (2009) and Tyurin (2010). As of 2011 the best estimate is 0.5600 obtained by Shevtsova (2010).

References

  • Berry, Andrew C. (1941). "The Accuracy of the Gaussian Approximation to the Sum of Independent Variates". Transactions of the American Mathematical Society 49 (1): 122–136. doi:10.1090/S0002-9947-1941-0003498-3. JSTOR 1990053. 
  • Durrett, Richard (1991). Probability: Theory and Examples. Pacific Grove, CA: Wadsworth & Brooks/Cole. ISBN 0-534-13206-5.
  • Esseen, Carl-Gustav (1942). "On the Liapunoff limit of error in the theory of probability". Arkiv för matematik, astronomi och fysik A28: 1–19. ISSN 0365-4133. 
  • Esseen, Carl-Gustav (1956). "A moment inequality with an application to the central limit theorem". Skand. Aktuarietidskr. 39: 160–170. 
  • Feller, William (1972). An Introduction to Probability Theory and Its Applications, Volume II (2nd ed.). New York: John Wiley & Sons. ISBN 0-471-25709-5.
  • Korolev, V. Yu.; Shevtsova, I. G. (2010). "On the upper bound for the absolute constant in the Berry-Esseen inequality". Theory of Probability and its Applications 54 (4): 638–658. doi:10.1137/S0040585X97984449. 
  • Korolev, Victor; Shevtsova, Irina (2010). "An improvement of the Berry-Esseen inequality with applications to Poisson and mixed Poisson random sums". Scandinavian Actuarial Journal: 1–25. doi:10.1080/03461238.2010.485370. http://www.tandfonline.com/doi/abs/10.1080/03461238.2010.485370. 
  • Manoukian, Edward B. (1986). Modern Concepts and Theorems of Mathematical Statistics. New York: Springer-Verlag. ISBN 0-387-96186-0.
  • Serfling, Robert J. (1980). Approximation Theorems of Mathematical Statistics. New York: John Wiley & Sons. ISBN 0-471-02403-1.
  • Shevtsova, I. G. (2008). "On the absolute constant in the Berry-Esseen inequality". The Collection of Papers of Young Scientists of the Faculty of Computational Mathematics and Cybernetics (5): 101–110. 
  • Shevtsova, I. G. (2007). "Sharpening of the upper bound of the absolute constant in the Berry–Esseen inequality". Theory of Probability and its Applications 51 (3): 549–553. doi:10.1137/S0040585X97982591. 
  • Shevtsova, I. G. (2010). "An Improvement of Convergence Rate Estimates in the Lyapunov Theorem". Doklady Mathematics 82 (3): 862–864. doi:10.1134/S1064562410060062. 
  • Shiganov, I.S. (1986). "Refinement of the upper bound of a constant in the remainder term of the central limit theorem". Journal of Soviet mathematics 35 (3): 109–115. doi:10.1007/BF01121471. 
  • Tyurin, I.S. (2009). "On the accuracy of the Gaussian approximation". Doklady Mathematics 80 (3): 840–843. doi:10.1134/S1064562409060155. 
  • Tyurin, I.S. (2010). "An improvement of upper estimates of the constants in the Lyapunov theorem". Russian Mathematical Surveys 65 (3(393)): 201–202. 
  • van Beek, P. (1972). "An application of Fourier methods to the problem of sharpening the Berry–Esseen inequality". Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 23 (3): 187–196. doi:10.1007/BF00536558. 
  • Zolotarev, V. M. (1967). "A sharpening of the inequality of Berry–Esseen". Z. Wahrsch. Verw. Geb. 8 (4): 332–342. doi:10.1007/BF00531598. 